168 PART 4 Comparing Groups
» It’s flexible! The test works for tables with any number of rows and columns,
and it easily handles cell counts of any magnitude. Statistical software can
usually complete the calculations quickly, even on big data sets.
But the chi-square test has some shortcomings:
» It’s not an exact test. The p value it produces is only approximate, so using
p < 0.05 as your criterion for statistical significance (meaning setting α = 0.05)
doesn’t necessarily guarantee that your Type I error rate will be only
5 percent. Remember, your Type I error rate is the likelihood that you will claim
statistical significance for a difference that isn’t real (see Chapter 3 for an
introduction to Type I errors). The p value is quite accurate when all the
cells in the table have large counts, but it becomes unreliable when one or
more cell counts are very small (or zero). Recommendations differ as to the
minimum counts you need per cell in order to use the chi-square test
confidently. A rule of thumb that many analysts use is that you should have
at least five observations in each cell of your table (or better yet, an
expected count of at least five in each cell).
» It’s not good at detecting trends. The chi-square test isn’t good at detecting
small but steady progressive trends across the successive categories of an
ordinal variable (see Chapter 4 if you’re not sure what ordinal is). It may give a
significant result if the trend is strong enough, but it’s not designed specifically
to work with ordinal categorical data. In those cases, you should use a
Mantel-Haenszel chi-square test for trend, which is outside the scope of this
book.
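The rule of thumb about expected counts can be checked before you run the test. The following is a minimal Python sketch, not a definitive implementation. The function names `expected_counts` and `meets_rule_of_five` and both tables of counts are hypothetical, made up for illustration only. It assumes the usual formula for an expected count: row total times column total, divided by the grand total.

```python
def expected_counts(table):
    """Return the expected count for each cell, assuming no association.

    `table` is a list of rows of observed counts (any number of rows/columns).
    Expected count = (row total * column total) / grand total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def meets_rule_of_five(table, minimum=5):
    """True if every expected count is at least `minimum` (rule of thumb only)."""
    return all(e >= minimum for row in expected_counts(table) for e in row)


# Hypothetical tables, invented for this illustration.
roomy_table = [[10, 20], [30, 40]]
sparse_table = [[1, 9], [2, 88]]

print(expected_counts(roomy_table))
print(meets_rule_of_five(roomy_table))
print(meets_rule_of_five(sparse_table))
```

If a table fails this check, the chi-square p value may be unreliable, as the first shortcoming above describes.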
Modifying the chi-square test: The Yates continuity correction
There is a little drama around the original Pearson chi-square test of association
that needs to be mentioned here. Yates, who was a contemporary of Pearson,
developed what is called the Yates continuity correction. Yates argued that in the
special case of the fourfold table, adding this correction results in more reliable
p values. The correction consists of subtracting 0.5 from the magnitude of the
(Ob – Ex) difference before squaring it.
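To make the correction concrete, here is a minimal Python sketch that computes the chi-square statistic for a fourfold table with and without the Yates correction. The function name `chi_square_2x2` and the table of counts are hypothetical and are not the sample data from this chapter. The guard that keeps the corrected magnitude from dropping below zero is an added assumption, not part of the description above.

```python
def chi_square_2x2(table, yates=False):
    """Chi-square statistic for a 2x2 table of observed counts.

    With yates=True, 0.5 is subtracted from the magnitude of each
    (Ob - Ex) difference before squaring.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            magnitude = abs(observed - expected)
            if yates:
                # Assumption: never let the corrected magnitude go negative.
                magnitude = max(magnitude - 0.5, 0.0)
            statistic += magnitude ** 2 / expected
    return statistic


# Hypothetical fourfold table, invented for this illustration.
table = [[10, 20], [30, 40]]
print(round(chi_square_2x2(table), 4))              # uncorrected
print(round(chi_square_2x2(table, yates=True), 4))  # with Yates correction
```

Because the correction shrinks every difference before squaring, the corrected statistic is never larger than the uncorrected one, so the corrected p value is never smaller.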
Let’s apply the Yates continuity correction for your analysis of the sample data in
the earlier section “Understanding how the chi-square test works.” Take a look at
Figure 12-3, which has the differences between the values in the observed and
expected cells. The application of the Yates correction changes the 7.20 (or –7.20)
difference in each cell to 6.70 (or –6.70). This lowers the chi-square value from